Large-vocabulary continuous speech recognition using linear lexicon search and 1-best approximation tree-structured lexicon search

Authors

  • Norihide Kitaoka
  • Nobutoshi Takahashi
  • Seiichi Nakagawa
Abstract

The computational cost of a large-vocabulary continuous speech recognition system based on HMMs is proportional to its vocabulary size. A tree-structured lexicon is generally used to reduce the number of HMM states. An approximation of the dependence of word boundaries and likelihoods on word histories is also used to limit the growth in the number of hypotheses during forward decoding. We first compared search algorithms that use a tree-structured lexicon with certain approximation methods against algorithms that use a linear lexicon. The algorithm based on 1-best approximation with a tree-structured lexicon is efficient but frequently misses the optimal sentence hypothesis. Linear lexicon search can find the optimal hypothesis but has a high computational cost. We therefore propose a search method that combines 1-best approximation tree-structured lexicon search with linear lexicon search. The words expanded in the linear lexicon are selected dynamically according to likelihood. We evaluated this new search algorithm on read speech, broadcast news speech, and lecture speech, and obtained a significant improvement in recognition performance while expanding only 250 to 500 of the 20,000 words in the linear lexicon. © 2005 Wiley Periodicals, Inc. Syst Comp Jpn, 36(7): 31–39, 2005; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.20252
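The saving that a tree-structured lexicon offers over a linear one can be illustrated with a small sketch. The code below is not the authors' implementation: the toy lexicon, the phoneme symbols, and the function names are all hypothetical, and each phoneme is counted as a single node rather than as a multi-state HMM. It only shows how sharing common pronunciation prefixes reduces the number of nodes the search has to keep active.

```python
# Toy pronunciation lexicon. The words and phoneme strings are
# illustrative assumptions, not data from the paper.
LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "spell": ["s", "p", "eh", "l"],
    "tree": ["t", "r", "iy"],
    "trie": ["t", "r", "ay"],
}


def linear_node_count(lexicon):
    """Linear lexicon: every word keeps its own private phoneme chain."""
    return sum(len(phones) for phones in lexicon.values())


def build_tree(lexicon):
    """Tree-structured lexicon: words with a common prefix share nodes."""
    root = {"children": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node["children"].setdefault(
                phone, {"children": {}, "words": []}
            )
        # The word identity is attached only where its pronunciation ends.
        node["words"].append(word)
    return root


def tree_node_count(node):
    """Count phoneme nodes below `node` (the root itself is not counted)."""
    return sum(
        1 + tree_node_count(child) for child in node["children"].values()
    )


if __name__ == "__main__":
    tree = build_tree(LEXICON)
    print("linear lexicon nodes:", linear_node_count(LEXICON))
    print("tree lexicon nodes:  ", tree_node_count(tree))
```

In this toy case the linear lexicon needs 18 phoneme nodes and the tree needs 11. The sketch also makes the trade-off visible: in the tree, a word's identity is attached only at the node where its pronunciation ends, so history-dependent scores cannot be applied at the start of a word. That is the setting in which approximations such as the 1-best approximation are used.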

Similar resources

Dynamic Programming Search for Continuous Speech Recognition

Initially introduced in the late 1960s and early 1970s, dynamic programming algorithms have become increasingly popular in automatic speech recognition. There are two reasons why this has occurred: First, the dynamic programming strategy can be combined with a very efficient and practical pruning strategy so that very large search spaces can be handled. Second, the dynamic programming strategy ha...

Speech Input Acoustic Analysis Phoneme Inventory Pronunciation Lexicon

This paper gives an overview of an architecture and search organization for large vocabulary, continuous speech recognition (LVCSR at RWTH). In the first part of the paper, we describe the principle and architecture of an LVCSR system. In particular, the issues of modeling and search for phoneme based recognition are discussed. In the second part, we review the word conditioned lexical tree search...

Speech Input Acoustic Analysis Phoneme Inventory Pronunciation Lexicon Language Model

This paper gives an overview of an architecture and search organization for large vocabulary, continuous speech recognition (LVCSR at RWTH). In the first part of the paper, we describe the principle and architecture of an LVCSR system. In particular, the issues of modeling and search for phoneme based recognition are discussed. In the second part, we review the word conditioned lexical tree search...

Experimental analysis of the search space for 20 000-word speech recognition

In this paper we investigate the search effort for large vocabulary continuous speech recognition. In particular, we study the effect of different pruning techniques on the search effort and on search errors. The experimental results show that it is much more efficient in the search procedure to use a tree lexicon than a linear lexicon. For the tree search method, we study the search space in detail....

Look-ahead Techniques for Improved Beam Search

This paper presents two look-ahead techniques for large vocabulary continuous speech recognition. These two techniques, which are referred to as language model look-ahead and phoneme look-ahead, are incorporated into the pruning process of the time-synchronous one-pass beam search algorithm. The search algorithm is based on a tree-organized pronunciation lexicon in connection with a bigram lang...

Journal title:
  • Systems and Computers in Japan

Volume 36, Issue 7

Pages 31–39

Publication date: 2005